arXiv:1509.01324v3 [cs.IT] 17 Aug 2016 


Security Concerns in Minimum Storage Cooperative Regenerating Codes

Abstract. We revisit the secrecy capacity of minimum storage cooperative regenerating (MSCR) codes under the $\{l_1, l_2\}$-eavesdropper model, where the eavesdropper can observe the data stored on $l_1$ nodes and the repair downloads of an additional $l_2$ nodes. Compared to minimum storage regenerating (MSR) codes, which support only single-node repairs, MSCR codes allow efficient simultaneous repair of multiple failed nodes, referred to as a repair group. However, the repair data sent from a helper node to a failed node may vary with the repair group or the set of helper nodes, which leaks additional information to the eavesdropper and may even leave the storage system unable to maintain any data secrecy.

In this paper, we introduce and study a special category of MSCR codes, termed “stable” 
MSCR codes, where the repair data from any one helper node to any one failed node is required to 
be independent of the repair group or the set of helper nodes. Our main contributions include: 1. 
Demonstrating that two existing MSCR codes inherently are not stable and thus have poor secrecy 
capacity, 2. Converting one existing MSCR code to a stable one, which offers better secrecy capacity 
when compared to the original one, 3. Employing information theoretic analysis to characterize the 
secrecy capacity of stable MSCR codes in certain situations.

1 INTRODUCTION

Distributed storage systems (DSSs) are an essential infrastructure for the generation, analysis and archiving of rapidly growing data. DSSs have become a fundamental and indispensable component of rapidly developing distributed networking applications, especially cloud computing, social networking and peer-to-peer networking. To guarantee the reliability and availability of DSSs, data redundancy has to be introduced. Replication and erasure codes are the two traditional approaches to data redundancy, and erasure codes achieve higher reliability than replication for the same level of redundancy [1]. Recently, Dimakis et al. [5] employed network information flow to determine a class of regenerating codes, which outperform traditional erasure codes in repair efficiency.


1.1 Regenerating Codes 

Regenerating codes [2] are a family of codes determined by trading off the amount of storage per node against the repair bandwidth. In a regenerating-coding-based DSS, an original data file of size $B$ is encoded into $n\alpha$ symbols and then distributed across $n$ nodes. The symbols are drawn from a finite field $\mathbb{F}_q$ and each node stores $\alpha$ symbols. The basic features of regenerating codes are the reconstruction and regeneration properties: the original data file can be retrieved by contacting any $k$ out of $n$ nodes, and any failed node can be recovered by letting a new node connect to any $d$ helper nodes among the remaining $(n-1)$ nodes and download $\beta$ symbols from each of them. Regenerating codes are parameterized by $\{n, k, d, \alpha, \beta, B\}$ and satisfy the following constraint (tradeoff curve):

$$B \le \sum_{i=1}^{k} \min\{\alpha,\ (d-i+1)\beta\}. \tag{1}$$


Most studies focus on the two extreme points, known as minimum storage regenerating (MSR) codes and minimum bandwidth regenerating (MBR) codes. As shown in [2], the parameters of the two points are given by

$$\begin{cases} (\alpha_{MSR}, \beta_{MSR}) = \left(\dfrac{B}{k},\ \dfrac{B}{k(d-k+1)}\right), \\ (\alpha_{MBR}, \beta_{MBR}) = \left(\dfrac{2dB}{k(2d-k+1)},\ \dfrac{2B}{k(2d-k+1)}\right). \end{cases} \tag{2}$$
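
As a quick numerical check of the two extreme points, the following minimal Python sketch evaluates the right-hand side of the tradeoff constraint (1) at the MSR and MBR parameters. The function names and the sample values $k = 3$, $d = 4$, $B = 12$ are illustrative assumptions and are not taken from [2].

```python
from fractions import Fraction

def tradeoff_bound(k, d, alpha, beta):
    """Right-hand side of (1): sum over i = 1..k of min(alpha, (d - i + 1) * beta)."""
    return sum(min(alpha, (d - i + 1) * beta) for i in range(1, k + 1))

def msr_point(k, d, B):
    """(alpha, beta) at the minimum-storage point."""
    return Fraction(B, k), Fraction(B, k * (d - k + 1))

def mbr_point(k, d, B):
    """(alpha, beta) at the minimum-bandwidth point."""
    denom = k * (2 * d - k + 1)
    return Fraction(2 * d * B, denom), Fraction(2 * B, denom)

# Sample parameters, assumed for illustration only.
k, d, B = 3, 4, 12
for name, (alpha, beta) in (("MSR", msr_point(k, d, B)), ("MBR", mbr_point(k, d, B))):
    # The last printed field tells whether the bound is met with equality.
    print(name, alpha, beta, tradeoff_bound(k, d, alpha, beta) == B)
```

Exact rational arithmetic is used so that the equality test is not affected by rounding.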

Besides, three repair models are considered in the literature: functional repair, exact repair, and exact repair of systematic nodes [3]. Among them, exact repair is preferred in practical systems since the lost data of any failed node is regenerated exactly [1]. For exact repair, most interior points on the storage-bandwidth tradeoff curve have been shown to be nonachievable, and code constructions are rare for those interior points that might be achievable.

So far, there are many explicit constructions with the exact repair property. In [9], the authors use the product-matrix framework to propose MBR codes for all parameters and MSR codes under the constraint $\{d \ge 2k-2\}$. In the MSR scenario, much progress has been made. Broadly, there are two main classes of MSR codes: scalar MSR codes with $\{\beta = 1\}$ [9,10,11,12,13,14] and vector MSR codes with $\{\beta = (n-k)^x\}$, where $x \ge 1$ [15,16,17,18,19,20,21]. Most of these constructions rely heavily on the concept of interference alignment. According to the analysis in [14], interference alignment is necessary for constructing scalar linear MSR codes, and these codes exist only when $d \ge 2k-2$. They therefore correspond to the low-rate regime (i.e., $\frac{k}{n} \le \frac{1}{2} + \frac{1}{2n}$, which follows from $2k-2 \le d \le n-1$). For designing high-rate codes with $\{\frac{k}{n} > \frac{1}{2}\}$, vector MSR codes are applicable since they place no constraints on the parameters $(n, k)$. However, many of these vector codes allow efficient repair of systematic nodes only. Strictly speaking, MSR codes restricted to efficient systematic repair are not formal MSR codes, since the formal ones require that any failed node, including parity nodes, be efficiently repaired. Given this concern, later vector MSR constructions allow efficient repair of all nodes in different ways. In addition to repair efficiency, the Zigzag code has the optimal update property and the optimal access property, while its variant also has the optimal access property. These two properties are of significant value to practical implementations. Furthermore, locally repairable codes have lately attracted a lot of attention due to their practical performance [22,23,24].


All the above repair mechanisms are designed for a single node failure. However, DSSs also commonly experience multiple node failures. Some DSSs, such as Total Recall [8], adopt a lazy repair policy, where repair is triggered only when the number of node failures reaches a preset threshold. Although most existing regenerating codes can in principle handle multiple node failures by sequentially applying single-node repair procedures, they are not optimal in terms of repair bandwidth, as explained in [33].


1.2 Cooperative Regenerating Codes 

To allow efficient repair of multiple simultaneous node failures and further reduce the total repair overhead, Y. Hu et al. [33] propose cooperative regenerating codes. Unlike regenerating codes, the repair process of cooperative regenerating codes handles $t$ node failures together and is divided into two steps. In the first step, $t$ new nodes connect to any $d$ surviving nodes, and each new node downloads $\beta$ symbols from each helper node (surviving node). In the second step, the $t$ new nodes repair cooperatively by exchanging $\beta'$ symbols with each other, where the exchanging data is a function of the repair data obtained in the first step. The $t$ new nodes are called a repair group. Later, the authors in [34,35,36] derive the tradeoff curve between storage per node and repair bandwidth for cooperative regenerating codes. Similar to regenerating codes, cooperative regenerating codes achieving the two end points of the tradeoff curve are termed minimum bandwidth cooperative regenerating (MBCR) codes and minimum storage cooperative regenerating (MSCR) codes, respectively. The corresponding parameter sets $\{n, k, d, t, \alpha, \beta, \beta', B\}$ of the two points are


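$$\begin{cases} (\alpha_{MSCR}, \beta_{MSCR}, \beta'_{MSCR}) = \left(\dfrac{B}{k},\ \dfrac{B}{k(d-k+t)},\ \dfrac{B}{k(d-k+t)}\right), \\ (\alpha_{MBCR}, \beta_{MBCR}, \beta'_{MBCR}) = \left(\dfrac{(2d+t-1)B}{k(2d-k+t)},\ \dfrac{2B}{k(2d-k+t)},\ \dfrac{B}{k(2d-k+t)}\right). \end{cases} \tag{3}$$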

Here, we compare the repair bandwidth of MSR and MSCR codes. Assume a storage system with $\{n, k, d, B\}$, where $t$ is the threshold on the number of failed nodes. For MSR codes, each of the $t$ failed nodes contacts any $d$ out of the $(n-t)$ surviving nodes and downloads the repair data, which produces a total repair bandwidth of $\frac{tdB}{k(d-k+1)}$. For MSCR codes, each new node downloads $d\beta_{MSCR} + (t-1)\beta'_{MSCR}$ symbols, so recovering all $t$ failed nodes needs a total repair bandwidth of $\frac{t(d+t-1)B}{k(d-k+t)}$. When $t > 1$ and $k > 1$,

$$\frac{t(d+t-1)B}{k(d-k+t)} < \frac{tdB}{k(d-k+1)}, \tag{4}$$

since $d(d-k+t) - (d+t-1)(d-k+1) = (t-1)(k-1) > 0$. Hence MSCR codes are advantageous over MSR codes when repairing multiple node failures. However, there are not many constructions of cooperative regenerating codes so far.
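
The bandwidth comparison above can be checked numerically. The sketch below, with hypothetical helper names and sample parameters, evaluates both totals exactly and confirms the gap over a small parameter range.

```python
from fractions import Fraction

def msr_total_bandwidth(k, d, t, B):
    """Total download when t nodes are repaired one by one with an MSR code."""
    return Fraction(t * d * B, k * (d - k + 1))

def mscr_total_bandwidth(k, d, t, B):
    """Total download when t nodes are repaired jointly with an MSCR code:
    each newcomer fetches d*beta repair symbols plus (t-1)*beta' exchanged symbols."""
    return Fraction(t * (d + t - 1) * B, k * (d - k + t))

# Sample parameters, assumed for illustration only.
k, d, t, B = 3, 4, 2, 12
print(msr_total_bandwidth(k, d, t, B), mscr_total_bandwidth(k, d, t, B))

# The MSCR total is strictly smaller whenever t > 1 and k > 1.
for k in range(2, 6):
    for d in range(k, 9):
        for t in range(2, 5):
            assert mscr_total_bandwidth(k, d, t, 1) < msr_total_bandwidth(k, d, t, 1)
```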

The authors in [37,38,39] present explicit constructions of MBCR codes, and the code proposed in [35] is built for all parameter settings. In the MSCR scenario, there are only a few constructions [40,41,42]. The construction in [40] is based on the special parameter setting $k = t = 2$, and the one in [41] is limited to the case $d = k$. In [42], the authors establish an equivalence between exact MSR codes and exact MSCR codes, such that linear scalar exact MSCR codes with $\{n, k, d-1, t=2\}$ can be built from any instance of linear scalar exact MSR codes with $\{n, k, d\}$.

Beyond the above issues of node failures in DSSs, security problems always exist, since massive numbers of storage nodes are widely spread across the network. Accordingly, it is preferable to incorporate security requirements into the design of cooperative-regenerating-coding-based DSSs. Our concern in this paper is the data secrecy of MSCR-coding-based DSSs.


1.3 Secrecy Concerns in DSSs 

The active attacker and passive attacker models are the two usual adversary models considered in the literature [25]. In the active adversary model, the attacker can perform operations on certain compromised nodes, such as modifying, injecting and deleting data. In this paper, we focus on the passive adversary model, where an adversary can only eavesdrop on the data stored on some $l_1$ nodes and the repair downloads of another $l_2$ nodes.

Related work (secure regenerating codes): The authors in [55] and [27] first investigated the problem of designing secure DSSs against eavesdropping. In [55], the authors analyze the secrecy capacity of regenerating codes under an initial adversary model where the contents of $l < k$ nodes are eavesdropped. They derive an upper bound on the secrecy capacity and propose a secure MBR coding scheme that attains this bound:

$$B^{(s)} \le \sum_{i=l+1}^{k} \min\{\alpha,\ (d-i+1)\beta\}. \tag{5}$$

Afterwards, the authors in [27] extend the initial eavesdropping model considered in [55]: the eavesdropper can also observe the repair downloads of an additional $l_2$ nodes apart from the data stored on the initial $l_1$ nodes, with the constraint that $l_1 + l_2 < k$. The secure product-matrix-based MBR coding scheme proposed in [27] is shown to achieve the bound (5) simply by changing $l$ into $l_1 + l_2$. The achievability follows from the fact that the repair bandwidth $d\beta$ equals the per-node storage $\alpha$ in the MBR scenario. Furthermore, the authors in [27] considered designing secure product-matrix-based MSR codes, but the secrecy capacity of their secure MSR coding scheme is only $(k - l_1 - l_2)(\alpha - l_2\beta)$, which is less than the value $(k - l_1 - l_2)\alpha$ given by the bound (5) when $l_2 > 0$. The reason is that the amount of repair downloads $d\beta$ is larger than the per-node storage $\alpha = (d-k+1)\beta$, and thus the $(l_1, l_2)$-eavesdropper can obtain more information than the contents of $(l_1 + l_2)$ nodes in the MSR scenario.
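
To make this gap concrete, the following sketch evaluates the bound (5) with $l = l_1 + l_2$ and the secrecy capacity $(k - l_1 - l_2)(\alpha - l_2\beta)$ of the product-matrix-based scheme at an MSR point. The function names and the sample parameters ($k = 3$, $d = 4$, $\beta = 1$) are assumptions chosen for illustration.

```python
def secrecy_upper_bound(k, d, alpha, beta, l):
    """Right-hand side of (5): sum over i = l+1..k of min(alpha, (d - i + 1) * beta)."""
    return sum(min(alpha, (d - i + 1) * beta) for i in range(l + 1, k + 1))

def pm_msr_secrecy(k, alpha, beta, l1, l2):
    """Secrecy capacity stated in the text for the secure product-matrix MSR scheme."""
    return (k - l1 - l2) * (alpha - l2 * beta)

# MSR point with beta = 1, so alpha = d - k + 1 (sample values with d >= 2k - 2).
k, d, beta = 3, 4, 1
alpha = (d - k + 1) * beta
for l1, l2 in [(2, 0), (1, 1), (0, 2), (1, 0), (0, 1)]:
    bound = secrecy_upper_bound(k, d, alpha, beta, l1 + l2)
    achieved = pm_msr_secrecy(k, alpha, beta, l1, l2)
    print((l1, l2), bound, achieved)
```

At an MSR point the bound equals $(k - l_1 - l_2)\alpha$, so the two values coincide only when $l_2 = 0$.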

Recently, the authors in [28] and [29] employed linear subspace intersection analysis to derive new upper bounds on the secrecy capacity of MSR codes. The Zigzag code and its variant are shown to achieve these new bounds through pre-coding with a maximum rank distance (MRD) code [31,32]. The bounds given in [29] match those in [28] when $l_2 < 2$. Thereafter, we [30] used information-theoretic analysis to obtain novel results on the secrecy capacity of MSR codes, including new insights on general MSR codes and generalized bounds on the secrecy capacity of linear MSR codes. In particular, we demonstrated that the secure product-matrix-based MSR codes given in [27] are also optimal whenever $l_1 + l_2 < k - 1$ and $l_2 < d - k + 1$. The final result on the secrecy capacity of linear MSR codes presented in [30] is closely related to the parameter $\beta$ and applies to all known MSR codes, including scalar MSR codes as well as vector MSR codes such as the Zigzag code [19]. Moreover, it also applies to the unexplored vector MSR codes with parameters $\{1 < \beta < d-k+1\}$. Furthermore, all of these results also apply to systematic MSR codes with the repair data of systematic nodes captured.

Related work (secure cooperative regenerating codes): In [43], the authors pioneer the study of the secrecy capacity of cooperative regenerating codes by min-cut analysis. Similar to MBR codes, the total repair bandwidth of MBCR codes for a repair group equals the total storage of the $t$ failed nodes. Thus, the secrecy capacity of MBCR codes is fully characterized under the $\{l_1, l_2\}$-eavesdropping model. For MSCR codes, they derived results on the secrecy capacity in some special cases and claimed that the two existing MSCR codes [40,41] can be transformed into secure MSCR codes. However, they only considered the information leakage under a single repair group and neglected an important detail of the repair property in the MSCR scenario. Because a node whose repair downloads are eavesdropped may belong to different repair groups, the eavesdropper may obtain different repair data sent from the same helper node to this node, which results in more information leakage. Even worse, it may be impossible for the storage system to keep any data secret after all possible repair groups have been traversed. We briefly describe this as follows.

Suppose there is an MSCR-coding-based DSS specified by $\{n, k, d, t=2, B\}$ and the repair downloads of node 1 are observed by the eavesdropper. Let $S_j^{1,(1,i)}$ denote the repair data sent from the surviving node $j$ to the failed node 1 under the repair group $(1, i)$, where $i \ne j$. If the storage system successively undergoes two different repair groups $(1, i_1)$ and $(1, i_2)$, where $i_1 \ne i_2$ and $S_j^{1,(1,i_1)} \ne S_j^{1,(1,i_2)}$, the eavesdropper will observe more information. In the worst case, the eavesdropper may obtain all the original data simply by waiting until all possible repair groups including node 1 have been traversed. Thus, it is difficult or even impossible to retain data secrecy if this kind of MSCR code is used.

Contributions: In this work, we study the data secrecy of MSCR codes under the $\{l_1, l_2\}$-eavesdropper model. Considering the security impact described above, we introduce a new class of MSCR codes, termed “stable” MSCR codes, where the repair data is restricted to be independent of the repair group and the set of helper nodes. To illustrate the importance of this “stable” property to security, we reanalyze the two existing MSCR codes [40,41] and demonstrate that both are inherently not stable. The MSCR code given in [40] offers no secrecy at all under the $\{l_1 = 0, l_2 = 1\}$-eavesdropper model, which makes it impossible to transform it into a secure MSCR code. In addition, we find that the other MSCR code, given in [41], has poor secrecy capacity and even loses all data secrecy in some cases. Subsequently, we convert the MSCR code given in [41] to a stable one by adjusting its repair strategy.

¹ Although a node in different repair groups appears in different repair scenarios and corresponds to distinct newcomer nodes, these distinct newcomer nodes corresponding to the same node must appear separately and cannot exist simultaneously in the storage system. Since the eavesdropper in this model is defined to be capable of observing the repair downloads of certain nodes at the same time, these newcomer nodes, which correspond to the same node and cannot appear simultaneously, can be viewed as one node if eavesdropped.

Then, we investigate the secrecy capacity of stable MSCR codes. Based on precoding with MRD codes, we give an information-theoretic expression of the secrecy capacity for general MSCR codes. By studying the basic properties of reconstruction and multiple simultaneous regenerations for general MSCR codes and stable MSCR codes, we derive a series of information-theoretic features of the node storage contents and the repair downloads. Combining these features with the secrecy expression, we present a simple expression of the secrecy capacity for stable MSCR codes and some specific characterizations of it. A similar result given in [43] is a special case of ours when $d = k$, although the authors therein considered only a single repair group. Finally, we calculate the secrecy capacity of the stable MSCR code built by conversion, which is consistent with our information-theoretic results and is clearly better than that of the original unstable code.


1.4 Organization 

Section 2 gives preliminaries on the system model and the adversary model from an information-theoretic perspective. Section 3 gives a detailed illustration of two existing MSCR codes. Section 4 presents some basic information-theoretic properties of general MSCR codes and stable MSCR codes. Section 5 provides the main results on the secrecy capacity of stable MSCR codes. Section 6 concludes the paper.


2 PRELIMINARIES 

In this section, we describe the system model and the eavesdropping model from information theoretic 
perspective. In addition, we give the definition of “stable” MSCR codes. 

A. Repair Terminology: Consider a DSS consisting of $n$ storage nodes. After $t$ nodes fail, $t$ new nodes are introduced to replace them. These $t$ new nodes constitute a repair group. Each new node connects to the same set of $d$ surviving nodes, chosen arbitrarily, and downloads $\beta$ symbols from each of these $d$ nodes. In the cooperative repair phase, each new node contacts the other $t-1$ new nodes in the same repair group and downloads $\beta'$ symbols from each of them. So, the nodes participating in the repair of a failed node can be categorized into surviving nodes (the $d$ helper nodes) and cooperative nodes (the other $t-1$ new nodes). Likewise, the repair downloads can be divided into “repair data” (from the surviving nodes) and “exchanging data” (from the cooperative nodes). Note that the exchanging data is not necessarily a function of the data stored in the original failed node; it is a function of the repair data of the corresponding new node.

The parameter notation of cooperative regenerating codes is $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$, which reduces to that of regenerating codes when $t = 1$. Based on the repair process, there are $\binom{n}{t}$ possible different repair groups in total. Fig. 1 describes the basic system model with some parameters.

B. Parameter Notations: Given any cooperative regenerating code with parameter set $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$, we let

(1) $W_i$, $i \in [1,n]$, denote the random variable corresponding to the content of node $i$, with $H(W_i) = \alpha$.

(2) $\{W_A, A \subseteq [1,n]\}$ denote the set of random variables corresponding to the nodes in the subset $A$. Throughout the paper, the subscript of $W$ can represent either a node index or a set of nodes, which will be clear from the context.

(3) $S_i^j$, $\{i,j\} \subseteq [1,n]$, $i \ne j$, denote the random variable corresponding to the symbols of repair data sent by the surviving node $i$ to the new node $j$, where $H(S_i^j) = \beta$.

(4) $S_A^B$ denote the set $\{S_i^j \mid i \in A, j \in B, i \ne j, A \subseteq [1,n], B \subseteq [1,n]\}$; in particular, $S_A^j$ substitutes for $S_A^{\{j\}}$.

(5) $\underline{S}_i^j$, $\{i,j\} \subseteq [1,n]$, $i \ne j$, denote the random variable corresponding to the symbols of exchanging data sent by the new node $i$ to another new node $j$, when nodes $i$ and $j$ are in the same repair group, where $H(\underline{S}_i^j) = \beta'$.

(6) $\underline{S}_A^B$ denote the set $\{\underline{S}_i^j \mid i \in A, j \in B, i \ne j, A \subseteq [1,n], B \subseteq [1,n]\}$.

[Figure 1: diagram of the repair phase]

Fig. 1. $C = (i_1, \cdots, i_t)$ is one repair group and $D = (j_1, \cdots, j_d)$ is one set of helper nodes, where $C$ is disjoint from $D$. In the first repair phase, each new node in $C$ downloads $\beta$ symbols from each helper node in $D$, i.e., $(S_D^{i_1}, \cdots, S_D^{i_t})$. In the cooperative repair phase, the new nodes mutually exchange $\beta'$ symbols, i.e., $(\underline{S}_{C\setminus\{i_1\}}^{i_1}, \cdots, \underline{S}_{C\setminus\{i_t\}}^{i_t})$. Thus, the total repair downloads of each new node $i_l$ in $C$ are $\{S_D^{i_l}, \underline{S}_{C\setminus\{i_l\}}^{i_l}\}$ for $1 \le l \le t$, which are used to recover $W_{i_l}$, the original storage of the failed node $i_l$.

Remark 1 Compared to regenerating codes, cooperative regenerating codes have one more parameter, the exchanging data $\underline{S}_i^j$. According to the above notation of the exchanging data $\underline{S}_i^j$ and the procedure of cooperative repair, for any repair group $C$ and any helper node set $D$ with $i \in C$ and $D \subseteq [1,n] \setminus C$, it must hold that

$$\begin{cases} H(\underline{S}_i^{C\setminus\{i\}} \mid S_D^i) = 0, \\ H(W_i \mid S_D^i, \underline{S}_{C\setminus\{i\}}^i) = 0, \end{cases} \tag{6}$$

where the first term means that the exchanging data $\underline{S}_i^{C\setminus\{i\}}$ is a function of the repair data $S_D^i$ of node $i$, and the second term implies that node $i$ can be regenerated from the repair data $S_D^i$ together with the exchanging data $\underline{S}_{C\setminus\{i\}}^i$.

In addition, any $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$ MSCR code must be an MDS code (reconstruction property) and have the regeneration property that any $t$ failed nodes can be repaired simultaneously. These two basic properties can be expressed as

$$\begin{cases} H(W_A) = k\alpha = B, \quad A \subseteq [1,n],\ |A| = k, \\ H(W_C \mid S_D^C) = 0, \end{cases} \tag{7}$$

where $C$ and $D$ are defined as in equation (6). When $n = d+t$, $D$ is unique after the choice of $C$.

C. Eavesdropping Model: We consider an $\{l_1, l_2\}$-eavesdropper, which has access to the storage contents of the nodes in set $E$ and can additionally observe the repair downloads of the nodes in set $F$, where $|E| = l_1$, $|F| = l_2$ and $l_1 + l_2 < k$. Besides, we let $G$ be another node set of size $(k - l_1 - l_2)$, where $G \subseteq [1,n] \setminus (E \cup F)$.

Different from regenerating codes, the repair downloads of any node in $F$ here comprise the repair data from $d$ helper nodes and the exchanging data from $t-1$ cooperative nodes. As shown in Fig. 2, there are $\binom{n-1}{t-1}$ possible sets of cooperative nodes once a failed node is fixed, and $\binom{n-t}{d}$ possible helper node sets once a repair group is determined. Thus, after traversing all possible repair groups and sets of helper nodes, the $\{l_1, l_2\}$-eavesdropper is supposed to have the knowledge

$$\left\{ W_E, \left\{ S_D^i, \underline{S}_{C\setminus\{i\}}^i \mid i \in C \cap F,\ C \subset [1,n],\ D \subset [1,n] \setminus C,\ |C| = t,\ |D| = d \right\} \right\}, \tag{8}$$

where $C \subset [1,n]$ indicates that $C$ traverses $[1,n]$, and so does $D$. For brevity, we write $S^F$ for $\{S_D^i, \underline{S}_{C\setminus\{i\}}^i \mid i \in C \cap F, C \subset [1,n], D \subset [1,n] \setminus C, |C| = t, |D| = d\}$, and thus $\{W_E, S^F\}$ is the information leakage obtained by the eavesdropper. In [43], the authors only consider the eavesdropping model under a single repair group.
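
The index set of (8) can be enumerated directly. The sketch below is only an illustration under assumed toy parameters ($n = 6$, $t = 2$, $d = 3$, $F = \{1\}$); the function name `eavesdropper_views` is hypothetical. It lists every triple (observed node, repair group, helper set) that contributes repair downloads to the eavesdropper and checks the count against $\binom{n-1}{t-1}\binom{n-t}{d}$ for a single observed node.

```python
from itertools import combinations
from math import comb

def eavesdropper_views(n, t, d, F):
    """Yield (i, C, D): observed node i in F, repair group C containing i,
    and helper set D of size d chosen among the nodes outside C."""
    nodes = range(1, n + 1)
    for C in combinations(nodes, t):
        observed = [i for i in C if i in F]
        if not observed:
            continue
        outside = [v for v in nodes if v not in C]
        for D in combinations(outside, d):
            for i in observed:
                yield i, C, D

# Toy parameters, assumed for illustration (they satisfy n >= d + t).
n, t, d = 6, 2, 3
views = list(eavesdropper_views(n, t, d, F={1}))
print(len(views), comb(n - 1, t - 1) * comb(n - t, d))
```

Each triple corresponds to one pair $\{S_D^i, \underline{S}_{C\setminus\{i\}}^i\}$ in (8), which is why considering a single repair group can underestimate the leakage.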


[Figure 2: diagram with labels “Nodes under eavesdropping”, “One set of helper nodes D”, “One repair group C”, and “Repair downloads of node $l_1+1$ given a repair group C and a helper node set D”]

Fig. 2. $E$ is the set of nodes whose contents are eavesdropped and $F$ is the set of nodes whose repair downloads are observed by the eavesdropper. Given a repair group $C$ including node $l_1+1$ and a set of helper nodes $D$, red lines indicate the repair data $S_D^{l_1+1}$ and blue lines stand for the exchanging data $\underline{S}_{C\setminus\{l_1+1\}}^{l_1+1}$, which together constitute the total repair downloads of the failed node $l_1+1$. Over all possible repair groups and sets of helper nodes, the repair downloads of node $l_1+1$ that the eavesdropper may obtain are $\{S_D^{l_1+1}, \underline{S}_{C\setminus\{l_1+1\}}^{l_1+1} \mid l_1+1 \in C, C \subset [1,n], D \subset [1,n] \setminus C\}$. Thus, for the eavesdropped node sets $E$ and $F$, the total information that may be leaked to the eavesdropper is $\{W_E, \{S_D^i, \underline{S}_{C\setminus\{i\}}^i \mid i \in F, i \in C, C \subset [1,n], D \subset [1,n] \setminus C\}\}$.


D. Security Consideration: Based on the above eavesdropping model, we consider a special class of MSCR codes, where the repair data sent from any surviving node $i$ to a new node $j$ is independent of the choice of the other $t-1$ cooperative nodes and the other $d-1$ helper nodes. That is, the content of a single piece of repair data $S_i^j$ is fixed and depends only on the helper node index $i$ and the failed node index $j$. However, we do not restrict the content of the exchanging data $\underline{S}_i^j$ to be invariant as well, i.e., it may vary across the repair groups that include both nodes $i$ and $j$. Nevertheless, we will show that, for the total amount of information leakage, it does not matter whether the exchanging data is restricted to be fixed or not.

As discussed before, this restriction on the repair data is important for MSCR codes to be secure, since the $\{l_1, l_2\}$-eavesdropper can access the repair downloads of the nodes in $F$. Otherwise, the changing contents of the repair data $\{S_i^j \mid j \in F\}$ cause more information leakage across different repair groups or different sets of helper nodes, which is analogous to the situation of functional repair and may make it impossible to maintain the security of MSCR codes. Based on this security concern, we define such an MSCR code as follows.

Definition 1. (Stable MSCR Code): A stable MSCR code with $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$ is an MSCR code with the “stable” repair property, that is, for an arbitrary repair group $C$ including $j$ and an arbitrary set of helper nodes $D$ including $i$, the content of the repair data $S_i^j$ is independent of the choices of $C$ and $D$, where $i \ne j \in [1,n]$.

In the next section, we reconsider the two MSCR codes [40,41], for which the authors in [43] considered only a single repair group and neglected this “stable” property of MSCR codes.
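
Definition 1 can be read operationally. The following sketch is not part of the paper's analysis: it assumes that the repair data can be modeled as a deterministic, hashable value returned by a hypothetical callback `repair_data(i, j, C, D)`, and it checks by brute force whether that value ever changes with the repair group $C$ or the helper set $D$.

```python
from itertools import combinations

def is_stable(n, t, d, repair_data):
    """Return True if repair_data(i, j, C, D) depends only on (i, j)
    over all repair groups C (size t) and helper sets D (size d, disjoint from C)."""
    nodes = range(1, n + 1)
    seen = {}
    for C in combinations(nodes, t):
        outside = [v for v in nodes if v not in C]
        for D in combinations(outside, d):
            for j in C:
                for i in D:
                    content = repair_data(i, j, C, D)
                    # Remember the first content seen for (i, j); any later mismatch breaks stability.
                    if seen.setdefault((i, j), content) != content:
                        return False
    return True

def stable_rule(i, j, C, D):
    return (i, j)            # hypothetical rule: content fixed by helper and failed node

def group_dependent_rule(i, j, C, D):
    return (i, j, C)         # hypothetical rule: content changes with the repair group

print(is_stable(5, 2, 3, stable_rule), is_stable(5, 2, 3, group_dependent_rule))
```

The second rule mimics the behavior examined in the next section, where the symbol sent by a helper changes with the repair group.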


3 ILLUSTRATION OF EXISTING MSCR CODES 

In this section, we reanalyze the secrecy capacity of the two MSCR codes [40,41], whose stable property is overlooked in [43]. Both MSCR codes [40,41] will be shown to be not stable. The MSCR code proposed in [40] will further be shown impossible to transform into a secure MSCR code under the $\{l_1 = 0, l_2 = 1\}$-eavesdropping model. As for the one in [41], its original repair procedure is also not stable, but it can be converted to a stable one by adjusting the repair strategy.


3.1 Unstable MSCR Codes 

Here, we take the two MSCR codes as examples and explain why they are not stable and why it is hard or even impossible for them to maintain data secrecy under the $\{l_1, l_2\}$-eavesdropping model.


3.1.1 MSCR-Code-A. The authors in [43] first investigated the secrecy capacity of the MSCR code [40] with the special parameters $\{d \ge k = t = 2\}$. Under the constraint $l_1 + l_2 < k = 2$, they analyzed two cases, i.e., $\{l_1 = 1, l_2 = 0\}$ and $\{l_1 = 0, l_2 = 1\}$. The first case $\{l_1 = 1, l_2 = 0\}$ is trivial, as only the content of one node is eavesdropped and no repair downloads are leaked. Thus, the construction of the secure MSCR code under the $\{l_1 = 1, l_2 = 0\}$-eavesdropping model given in [43] is correct.

As for the second case $\{l_1 = 0, l_2 = 1\}$, they considered a single repair group made of the two systematic nodes. However, they overlooked the fact that the content of the repair data transferred to a systematic node changes across repair groups, which may include the same systematic node together with a parity node. In the following, we first describe the coding scheme and the repair strategy given in [40], and then show that this code cannot be transformed into a secure MSCR code under the $\{l_1 = 0, l_2 = 1\}$-eavesdropping model.

• Coding Scheme: The coding scheme is specified by $\{k = t = 2, \beta = 1\}$, which gives the special parameter setting $\{\alpha = d-k+t = d = n-2, B = k(d-k+t) = 2\alpha\}$. Keeping the notation used in [43], the procedure is described as follows:

∗: $a = (a_1, a_2, \cdots, a_\alpha)^T$ is systematically stored in the first node, and $b = (b_1, b_2, \cdots, b_\alpha)^T$ is systematically stored in the second node.

∗: $r_i$, a column vector of $\alpha$ symbols formed from $a$, $b$ and powers of $\omega$, is stored in the $i$th parity node, where $i \in [1,d]$ and $\omega$ is the generator of a finite field $\mathbb{F}_q$. For convenience of indexing, the $i$th parity node is marked as the $(i+2)$th node, $i \in [1,d]$. In matrix form, $r_i = a + B_i b$, where $B_i$ is the corresponding diagonal matrix.

• Repair Strategy: The detailed coding construction can be found in [40]; here we only care about its repair process. In [40], only the repair group comprising the two systematic nodes is considered. Other repair groups including a parity node can be handled in the same way as the two systematic nodes after a change of variables. Assume the repair downloads of the first node (node 1) are observed by the $\{l_1 = 0, l_2 = 1\}$-eavesdropper. Under the repair group $(1,2)$, the repair data sent from the $j$th parity node to node 1 is given by

$$S_{j+2}^{1,(1,2)} = z^T B_j^{-1} r_j = z^T (B_j^{-1} a + b), \tag{9}$$

where $z = (1, \cdots, 1)^T$ and $z^T b$ is an interference term that needs to be canceled out.

Now, we consider the other situations, where the repair group comprises the first node and the $i$th parity node with $i \ne j$. As suggested, we should view $\{a, r_i\}$, i.e., nodes $\{1, i+2\}$, as the two systematic nodes. For simplicity, we let $x = a$ and $y = r_i$. After the change of variables, we have

$$\begin{cases} b = -B_i^{-1} x + B_i^{-1} y, \\ r_j = (I - B_j B_i^{-1}) x + B_j B_i^{-1} y, \end{cases} \tag{10}$$

where $I$ is the identity matrix. To ensure the alignment of interference, the $j$th parity node should now send to node 1, under the repair group $(1, i+2)$,

$$S_{j+2}^{1,(1,i+2)} = z^T (B_i B_j^{-1} - I) x + z^T y = z^T B_i (B_j^{-1} a + b), \tag{11}$$

where $z^T y$ is now viewed as the interference. Similarly, the second systematic node (whose storage is $b$) should send to node 1, under the repair group $(1, i+2)$,

$$S_{2}^{1,(1,i+2)} = z^T B_i b = z^T B_i (-B_i^{-1} x + B_i^{-1} y) = -z^T x + z^T y, \tag{12}$$

where $z^T y$ needs to be canceled out.
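
The change of variables and the resulting repair symbols can be sanity-checked numerically. The sketch below is an illustration only: it works over a small prime field GF(101) with randomly drawn invertible diagonal matrices standing in for $B_i$ and $B_j$ (these are assumptions, not the specific matrices of [40]) and verifies identities (10), (11) and (12).

```python
import random

P = 101      # prime modulus of the toy field GF(P); illustrative choice
ALPHA = 4    # vector length alpha; illustrative choice

def inv(x):
    return pow(x, P - 2, P)                          # inverse in GF(P) via Fermat

def dmul(diag, v):
    return [(s * e) % P for s, e in zip(diag, v)]    # diagonal matrix times vector

def add(u, v):
    return [(s + e) % P for s, e in zip(u, v)]

def sub(u, v):
    return [(s - e) % P for s, e in zip(u, v)]

def zt(v):
    return sum(v) % P                                # z^T v with z = (1, ..., 1)^T

def check_change_of_variables(seed=0):
    rng = random.Random(seed)
    a = [rng.randrange(P) for _ in range(ALPHA)]
    b = [rng.randrange(P) for _ in range(ALPHA)]
    Bi = [rng.randrange(1, P) for _ in range(ALPHA)]  # hypothetical invertible diagonal B_i
    Bj = [rng.randrange(1, P) for _ in range(ALPHA)]  # hypothetical invertible diagonal B_j
    Bi_inv = [inv(e) for e in Bi]
    Bj_inv = [inv(e) for e in Bj]
    r_i = add(a, dmul(Bi, b))
    r_j = add(a, dmul(Bj, b))
    x, y = a, r_i
    # (10): b and r_j expressed through x and y
    ok10 = (b == dmul(Bi_inv, sub(y, x)) and
            r_j == add(sub(x, dmul(Bj, dmul(Bi_inv, x))), dmul(Bj, dmul(Bi_inv, y))))
    # (11): both forms of the symbol sent by parity node j under group (1, i+2)
    form_xy = (zt(sub(dmul(Bi, dmul(Bj_inv, x)), x)) + zt(y)) % P
    form_ab = zt(dmul(Bi, add(dmul(Bj_inv, a), b)))
    ok11 = form_xy == form_ab
    # (12): symbol sent by the second systematic node
    ok12 = zt(dmul(Bi, b)) == (zt(y) - zt(x)) % P
    return ok10, ok11, ok12

print(check_change_of_variables())
```

The right-hand side of (11) depends on $i$, so the symbol sent by the same parity node $j$ to node 1 differs from (9) whenever $B_i \ne I$.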

• Data Eavesdropped: Under the $\{l_1 = 0, l_2 = 1\}$-eavesdropping model, when the repair downloads of node 1 are eavesdropped, the total data eavesdropped (which is in fact all the repair downloads of node 1 under all possible repair groups) comprises the repair data of node 1 from the helper nodes and the exchanging data from the corresponding cooperative nodes. As shown in [43], under the single repair group $(1,2)$ made of the two systematic nodes, the information symbols observed by the eavesdropper are

$$\left\{ z^T\!\left(\alpha(\omega^0 + \cdots + \omega^{\alpha-1})^{-1} a + b\right),\ z^T(B_1^{-1} a + b),\ z^T(B_2^{-1} a + b),\ \cdots,\ z^T(B_d^{-1} a + b) \right\}, \tag{13}$$

where $z^T(\alpha(\omega^0 + \cdots + \omega^{\alpha-1})^{-1} a + b)$ is the exchanging data from node 2. Next, we show that the content of node 1, which the eavesdropper already obtains, combined with the repair data sent from any one helper node to node 1 over all possible repair groups, is enough for the eavesdropper to retrieve all the data. In other words, from the repair downloads of node 1 under any one repair group and all the repair data sent from node $j+2$ to node 1 under all possible repair groups, the eavesdropper can recover all the original data.

[Figure 3: node $j+2$ sends $\{z^T(B_j^{-1} a + b),\ z^T B_i (B_j^{-1} a + b) \mid i \ne j, i \in [1,d]\}$ to node 1]

Fig. 3. Under different repair groups including node 1, node $j+2$ (the $j$th parity node) sends different contents of repair data $S_{j+2}^1$ to node 1, which leaks more information to the eavesdropper.


As illustrated in Fig. 3, after traversing all possible repair groups that include node 1, the eavesdropper obtains in total $d = \alpha$ symbols of repair data sent from the $j$th parity node to the first node, namely

$$\{ z^T(B_j^{-1}a + b),\ z^T B_i (B_j^{-1}a + b) \mid i \neq j,\ i \in [1,d] \}, \qquad (14)$$

which is equivalent to

$$(a^T B_j^{-1} + b^T) \cdot [z, B_1 z, \cdots, B_{j-1}z, B_{j+1}z, \cdots, B_d z]. \qquad (15)$$

Here, it should be noted that the eavesdropper can obtain not only the repair data sent from the $j$th parity node as in formula (15), but also other repair downloads, including the repair data and the exchanging data sent from other helper nodes and cooperative nodes. Although the information leakage in formula (15) differs from formula (13) given in [43], it is now clear that both leakage formulas (13) and (15) are only parts of the total data eavesdropped under all possible repair groups. We consider here only the repair data sent from the $j$th parity node, as in formula (15), because the eavesdropper is already able to decode the original data using only the known content of $a$ and the repair data symbols in formula (15). This is illustrated as follows.

Required by the coding construction in |3D|, the following a x a matrix 


[z,Bi ^z,- ■ 


-1 


-1 




(16) 


should be invertible, which, as stated in [40], can be guaranteed by the condition that $q > n-1$ and $(\omega^0 + \cdots + \omega^{\alpha-1})^2\,\omega^{-(\alpha-1)} \notin \{0, \alpha^2\}$. Actually, based on this condition, we can also deduce that the following matrix from the formula (15)

$$[z, B_1 z, \cdots, B_{j-1}z, B_{j+1}z, \cdots, B_d z] \qquad (17)$$

is invertible.


^ Proof'. First, ^ is a diagonal matrix whose diagonal elements are “}. Then, 


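matrix (17) can be equivalently transformed into matrix (16), if $\omega^{-1}$ is regarded as the generator of the finite field $\mathbb{F}_q$. At last, if $\omega^{-1}$ satisfies $(\omega^0 + \omega^{-1} + \cdots + \omega^{-(\alpha-1)})^2\,\omega^{\alpha-1} \notin \{0, \alpha^2\}$, matrix (17) is invertible. For this, we can easily find the clue that

$$(\omega^0 + \cdots + \omega^{\alpha-1})^2\,\omega^{-(\alpha-1)} = (\omega^0 + \omega^{-1} + \cdots + \omega^{-(\alpha-1)})^2\,\omega^{\alpha-1}.$$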
Therefore, the eavesdropper can obtain $(a^T B_j^{-1} + b^T)$ simply by solving equation (15). In fact, $(a^T B_j^{-1} + b^T)$ contains all the storage information of node $j+2$, since $r_j = a + B_j b$. Then, combining it with the already obtained content of $a$ under any one repair group, the eavesdropper can obtain the content of $b$. That is to say, the $\{l_1 = 0, l_2 = 1\}$-eavesdropper can obtain all the information of the original data message $(a, b)$ just by observing the repair downloads of node 1 as it goes through all the repair groups $(1, l)$ for $l \in [1, d+2] \setminus \{1, j+2\}$. In this case, we cannot apply the one-time pad scheme used in [43] to encrypt or randomize secure information symbols, since all the information symbols have been eavesdropped and no secure information symbols are left. Hence, the MSCR code in [40] cannot be transformed into a secure MSCR code under the $\{l_1 = 0, l_2 = 1\}$-eavesdropping model.

3.1.2 MSCR-Code-B. The authors in [43] then investigated the secrecy capacity of the MSCR code given in [41] with $\{d = k, \alpha = t, \beta = 1\}$, which is in fact also not stable.

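• Coding Deployment: As shown in [41], the $k \cdot t$ original data packets are arranged in a $t \times k$ data matrix $M$ whose row representation is denoted by $(m_1^T, m_2^T, \cdots, m_t^T)$. Consider a $k \times n$ generator matrix

$$G = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ \theta_1 & \theta_2 & \theta_3 & \cdots & \theta_n \\ \vdots & \vdots & \vdots & & \vdots \\ \theta_1^{k-1} & \theta_2^{k-1} & \theta_3^{k-1} & \cdots & \theta_n^{k-1} \end{bmatrix} \qquad (18)$$

of which every $k \times k$ submatrix is a non-singular Vandermonde matrix. Then the original data matrix is encoded into $MG$, and the encoded data packets stored in node $j$ are $\{m_i^T g_j \mid i = 1, 2, \cdots, t\}$, where $g_j$ is the $j$th column of $G$.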

• Repair Strategy: When $t$ nodes have failed, $t$ new nodes contact any other $d = k$ surviving nodes, where the $t$ new nodes are indexed by $\{f_1, \cdots, f_t\}$ and the $k$ helper nodes are indexed by $\{\lambda_1, \cdots, \lambda_k\}$. Each helper node $\lambda_i$ sends its $j$th packet $m_j^T g_{\lambda_i}$ to the new node $f_j$, for $i \in [1,k]$. Because of the property of the Vandermonde matrix, $m_j$ can be recovered by inverting the matrix $[g_{\lambda_1}, g_{\lambda_2}, \cdots, g_{\lambda_k}]$. In the cooperative repair phase, the new node $f_j$ sends $m_j^T g_{f_i}$ to each other new node $f_i$, for $i \neq j \in [1,t]$. Thus, the new node $f_j$ receives $t-1$ data packets $\{m_i^T g_{f_j} \mid i \neq j \in [1,t]\}$ during the cooperative repair phase. Combining these with the previously obtained $m_j$, the initial state of node $f_j$ can be recovered.
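The deployment and the two repair phases above can be sketched on a toy instance. This is only an illustration under stated assumptions: the prime field size `P = 257`, the parameters `k, t, n`, the choice $\theta_j = j$, and the helper names `solve_mod`, `store` and `repair` are hypothetical and not taken from [41].

```python
# Toy instance of MSCR-Code-B over GF(P). P, k, t, n and theta_j = j are illustrative assumptions.
import random

P = 257  # a prime larger than n, so theta_1..theta_n are distinct field elements

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % P

def solve_mod(A, b):
    """Solve A x = b over GF(P) by Gauss-Jordan elimination (A square, invertible)."""
    size = len(A)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(size):
        piv = next(r for r in range(col, size) if aug[r][col])
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[size] for row in aug]

k, t, n = 3, 2, 7
random.seed(1)
M = [[random.randrange(P) for _ in range(k)] for _ in range(t)]     # rows m_1^T .. m_t^T
g = {j: [pow(j, e, P) for e in range(k)] for j in range(1, n + 1)}  # Vandermonde columns g_j
store = {j: [dot(M[i], g[j]) for i in range(t)] for j in g}         # node j holds m_i^T g_j

def repair(failed, helpers):
    """failed = [f_1..f_t] in the order used for packet indices; helpers = k surviving nodes."""
    # Phase 1: every helper sends its j-th packet to f_j, which solves for m_j.
    m = [solve_mod([g[h] for h in helpers], [store[h][j] for h in helpers])
         for j in range(t)]
    # Phase 2: f_j sends m_j^T g_{f_i} to every other f_i, so f_i ends with all t packets.
    return {f: [dot(m[j], g[f]) for j in range(t)] for f in failed}
```

Calling `repair([1, 2], [3, 4, 5])` rebuilds the stored packets of nodes 1 and 2. Note that the packet a helper sends depends on the position of the failed node inside `failed`, which is the property examined next.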

• Data Eavesdropped: According to the repair process, the repair data from a helper node $\lambda_i$ to a new node $f_j$ is $m_j^T g_{\lambda_i}$, where $j \in [1,t]$ and $f_j \in [1,n]$. Since there are $n$ possible node indices but only $t$ packet indices, the mapping from a node index $f_j$ to the packet index $j$ is not bijective. Besides, there are $\binom{n}{t}$ possible repair groups in total. So, there must exist two different repair groups $\{f_1, \cdots, f_t\}$ and $\{f'_1, \cdots, f'_t\}$, where $f_j \neq f'_j$ and $S_{\lambda_i}^{f_j} = S_{\lambda_i}^{f'_j} = m_j^T g_{\lambda_i}$ for some $j$. However, when nodes $f_j$ and $f'_j$ are in the same repair group, $S_{\lambda_i}^{f_j}$ and $S_{\lambda_i}^{f'_j}$ cannot both be equal to $m_j^T g_{\lambda_i}$. In other words, we cannot guarantee that the repair data from any helper node to any failed node is always fixed, which means precisely that this MSCR code is not stable and will leak more data information if the eavesdropper can observe the repair downloads of the corresponding node.

As shown in Fig. 4, for repair group $[1,t]$, we set $S_{t+2}^{i} = m_i^T g_{t+2}$ for $i \in [1,t]$. For another repair group $[2,t+1]$, we set $S_{t+2}^{t+1} = m_1^T g_{t+2}$ and $S_{t+2}^{i} = m_i^T g_{t+2}$ for $i \in [2,t]$. However, when node 1 and node $t+1$ are in the same repair group, such as $[1,3,\cdots,t+1]$, $S_{t+2}^{1}$ and $S_{t+2}^{t+1}$ cannot both be equal to $m_1^T g_{t+2}$.

As stated in [41], any $t$ new nodes are put in order by their serial numbers. In fact, such an ordering is the least secure choice. For example, if $n \ge 2t+k-1$, when the repair group gradually traverses from $[1,t]$ to $[t,2t-1]$, the repair data sent to node $t$ from the helper node set $[2t, 2t+k-1]$ is given by

$$\{ m_{t+1-i}^T g_\lambda \mid i \in [1,t],\ \lambda \in [2t, 2t+k-1] \} = \{ m_1^T g_\lambda, \cdots, m_t^T g_\lambda \mid \lambda \in [2t, 2t+k-1] \}, \qquad (19)$$
Fig. 4. In the repair group $[1,t]$, $S_{t+2}^{1} = m_1^T g_{t+2}$. In the repair group $[2,t+1]$, $S_{t+2}^{t+1} = m_1^T g_{t+2}$. However, in the repair group $[1,3,\cdots,t+1]$, we can only set $\{S_{t+2}^{1} = m_1^T g_{t+2},\ S_{t+2}^{t+1} = m_2^T g_{t+2}\}$ or $\{S_{t+2}^{1} = m_2^T g_{t+2},\ S_{t+2}^{t+1} = m_1^T g_{t+2}\}$, which indicates that one of the repair data $S_{t+2}^{1}$ and $S_{t+2}^{t+1}$ must change its content. If the eavesdropper observes the repair downloads of the node whose repair data changes, it will obviously obtain more data information.


which, if observed by the eavesdropper, can be used to decode all the original data packets $(m_1, m_2, \cdots, m_t)$ since $[g_{2t}, \cdots, g_{2t+k-1}]$ is invertible. This means that the eavesdropper can obtain all the original data information just by observing the repair data of node $t$ across sufficiently many repair groups.
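The sliding-group example can be checked with a few lines of code. The sketch below only tracks which row index of $M$ the helpers send to node $t$, assuming (as stated above) that failed nodes are ordered by serial number; the function names `packet_index` and `rows_exposed` are hypothetical.

```python
def packet_index(node, group):
    """1-based index j of the packet sent to `node` when `group` is ordered by serial number."""
    return sorted(group).index(node) + 1

def rows_exposed(t):
    """Row indices of M sent to node t while the repair group slides from [1,t] to [t,2t-1]."""
    exposed = set()
    for start in range(1, t + 1):
        group = list(range(start, start + t))  # the repair group [start, start+t-1]
        exposed.add(packet_index(t, group))
    return exposed
```

For any $t$, `rows_exposed(t)` covers every index in $[1,t]$, so each of the $k$ fixed helpers reveals all $t$ of its packets to an eavesdropper watching node $t$, in line with equation (19).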

Remark 2 Although the MSCR code given in [41] is not stable and possesses poor secrecy capacity, it can be converted to a stable one by adjusting its repair strategy, which will offer better secrecy capacity.


3.2 A Stable MSCR Code 


In this section, we present a stable MSCR code obtained by converting the repair strategy of the MSCR code given in [41].

We apply the same coding deployment but change the repair strategy. The main purpose is to make the repair data sent from a helper node $\lambda_i$ to a failed node $f_j$ depend only on this pair of nodes, and not on the choice of the repair group or the helper set. In other words, we need to ensure a bijection between the indices of failed nodes and the repair data packets given by a helper node. Thus, after the coding deployment, we consider a systematic MDS code $(m'_1, m'_2, \cdots, m'_t, m'_{t+1}, \cdots, m'_n)$ extended from the original data packets $(m_1, m_2, \cdots, m_t)$, where $(m'_1, m'_2, \cdots, m'_t) = (m_1, m_2, \cdots, m_t)$. For this, we can use a $t \times n$ generator matrix

$$G' = \begin{bmatrix} 1 & 0 & \cdots & 0 & \nu_{1,t+1} & \nu_{1,t+2} & \cdots & \nu_{1,n} \\ 0 & 1 & \cdots & 0 & \nu_{2,t+1} & \nu_{2,t+2} & \cdots & \nu_{2,n} \\ \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 & \nu_{t,t+1} & \nu_{t,t+2} & \cdots & \nu_{t,n} \end{bmatrix} \qquad (20)$$

of which every $t \times t$ submatrix is invertible. We let $g'_j$ denote the $j$th column of $G'$. Here, it should be noted that $G$ is a $k \times n$ matrix, while $G'$ is a $t \times n$ matrix. So, we have

$$[m_1, m_2, \cdots, m_t] \cdot G' = [m'_1, m'_2, \cdots, m'_t, m'_{t+1}, \cdots, m'_n], \qquad (21)$$

from which we can derive, for any $i \in [1, n-t]$,

$$m'_{t+i} = [m_1, m_2, \cdots, m_t] \cdot g'_{t+i} = \nu_{1,t+i}\, m_1 + \nu_{2,t+i}\, m_2 + \cdots + \nu_{t,t+i}\, m_t. \qquad (22)$$
The following is the new repair strategy, which is also shown in Fig. 5.

Fig. 5. Given a repair group $\{f_1, \cdots, f_t\}$ and a set of helper nodes $\{\lambda_1, \cdots, \lambda_k\}$, $S_{\lambda_i}^{f_j}$ is the repair data sent from node $\lambda_i$ to node $f_j$. Subsequently, each new node $f_j$ sends exchanging data to every other new node $f_i$, where $i \neq j \in [1,t]$. Each new node is then recovered exactly, by combining all the repair data and the exchanging data.


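Step 1. For any repair group $\{f_1, \cdots, f_t\}$ and any set of helper nodes $\{\lambda_1, \cdots, \lambda_k\}$, each helper node $\lambda_i$ sends to the new node $f_j$ the symbol $S_{\lambda_i}^{f_j} = (m_1^T g_{\lambda_i}, \cdots, m_t^T g_{\lambda_i}) \cdot g'_{f_j}$, where

$$\begin{aligned} S_{\lambda_i}^{f_j} &= (g'_{f_j})^T \cdot (m_1^T g_{\lambda_i}, \cdots, m_t^T g_{\lambda_i})^T \\ &= (g'_{f_j})^T \cdot [m_1, m_2, \cdots, m_t]^T \cdot g_{\lambda_i} \\ &= ([m_1, m_2, \cdots, m_t] \cdot g'_{f_j})^T \cdot g_{\lambda_i} \\ &= (m'_{f_j})^T g_{\lambda_i}, \end{aligned} \qquad (23)$$

where $(m_1^T g_{\lambda_i}, \cdots, m_t^T g_{\lambda_i})$ is exactly the original storage of node $\lambda_i$ and $m'_{f_j}$ is from equation (21). So, the repair data $S_{\lambda_i}^{f_j} = (m'_{f_j})^T g_{\lambda_i}$ is now a linear combination of the storage in node $\lambda_i$, while the original repair data is $m_j^T g_{\lambda_i}$ (the $j$th data packet of node $\lambda_i$). Furthermore, due to the invertibility of any $k \times k$ submatrix $[g_{\lambda_1}, \cdots, g_{\lambda_k}]$ of $G$, the linear combination $m'_{f_j}$ of the original data is obtained.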


Step 2. In the cooperative repair phase, the new node $f_j$ sends exchanging data $(m'_{f_j})^T g_{f_i}$ to each other new node $f_i$, for $i \neq j \in [1,t]$. Hence, the new node $f_j$ receives $t-1$ data packets $\{(m'_{f_i})^T g_{f_j} \mid i \neq j \in [1,t]\}$ in this phase.


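Step 3. At last, node $f_j$ combines the repair data and exchanging data $\{(m'_{f_j})^T g_{f_j},\ (m'_{f_i})^T g_{f_j} \mid i \neq j \in [1,t]\}$ to obtain $\{(m'_{f_i})^T g_{f_j} \mid i \in [1,t]\}$, which can be further expressed as

$$\begin{aligned} &\{[m_1, m_2, \cdots, m_t] \cdot [g'_{f_1}, g'_{f_2}, \cdots, g'_{f_t}]\}^T \cdot g_{f_j} \\ &= [g'_{f_1}, g'_{f_2}, \cdots, g'_{f_t}]^T \cdot [m_1, m_2, \cdots, m_t]^T \cdot g_{f_j} \\ &= [g'_{f_1}, g'_{f_2}, \cdots, g'_{f_t}]^T \cdot (m_1^T g_{f_j}, \cdots, m_t^T g_{f_j})^T, \end{aligned} \qquad (24)$$

where $(m_1^T g_{f_j}, \cdots, m_t^T g_{f_j})$ is the original storage of node $f_j$. As any $t \times t$ submatrix $[g'_{f_1}, g'_{f_2}, \cdots, g'_{f_t}]$ of $G'$ is invertible, node $f_j$ can be recovered.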

Remark 3 According to the above new repair strategy, the repair data from any helper node $\lambda_i$ to any failed node $f_j$, namely $S_{\lambda_i}^{f_j} = (m'_{f_j})^T g_{\lambda_i} = (m_1^T g_{\lambda_i}, \cdots, m_t^T g_{\lambda_i}) \cdot g'_{f_j}$, is independent of repair groups and sets of helper nodes. So, this MSCR code, built by converting the repair strategy, is a stable MSCR code.
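Steps 1 to 3 can be exercised on a small example. The sketch below is an illustration under stated assumptions, not the authors' implementation: the field size `P`, the parameters, the Vandermonde choice $\theta_j = j$, and in particular the use of a Cauchy matrix for the non-systematic part of $G'$ (one standard way to make every $t \times t$ submatrix of $[I_t \mid C]$ invertible) are choices made here for illustration, as are all function names.

```python
# Toy instance of the stable repair strategy over GF(P); all concrete parameters are assumptions.
import random

P = 257
k, t, n = 3, 2, 7

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % P

def solve_mod(A, b):
    """Solve A x = b over GF(P) by Gauss-Jordan elimination (A square, invertible)."""
    size = len(A)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(size):
        piv = next(r for r in range(col, size) if aug[r][col])
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, P)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[size] for row in aug]

random.seed(7)
M = [[random.randrange(P) for _ in range(k)] for _ in range(t)]     # rows m_1^T .. m_t^T
g = {j: [pow(j, e, P) for e in range(k)] for j in range(1, n + 1)}  # columns of G
store = {j: [dot(M[i], g[j]) for i in range(t)] for j in g}         # node j holds M g_j

def g_prime(j):
    """Column j of G' = [I_t | C]; C is Cauchy with x_i = i and y_j = 100 + j (assumed)."""
    if j <= t:
        return [int(i == j) for i in range(1, t + 1)]
    return [pow((i - 100 - j) % P, -1, P) for i in range(1, t + 1)]

def repair_symbol(helper, failed_node):
    """S = g'_f^T (M g_lambda): computed by the helper from its own storage only."""
    return dot(g_prime(failed_node), store[helper])

def stable_repair(failed, helpers):
    # Step 1: f collects k symbols and solves for m'_f = [m_1..m_t] g'_f.
    m_prime = {f: solve_mod([g[h] for h in helpers],
                            [repair_symbol(h, f) for h in helpers]) for f in failed}
    # Steps 2-3: f_i gathers m'_{f_l}^T g_{f_i} for all l and inverts [g'_{f_1}..g'_{f_t}]^T.
    return {fi: solve_mod([g_prime(fl) for fl in failed],
                          [dot(m_prime[fl], g[fi]) for fl in failed]) for fi in failed}
```

Because `repair_symbol` takes only the helper and the failed node as arguments, the symbol cannot depend on the rest of the repair group or on the helper set, which is the stable property of Remark 3.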

In the subsequent discussion, we study the secrecy capacity of stable MSCR codes from an information-theoretic perspective. We will also use the above stable MSCR code to calculate its specific secrecy capacity.

4 INFORMATION THEORETIC FEATURES OF MSCR CODES 

In this section, we first present a generally applicable secrecy expression for MSCR codes. Then, we 
present some information theoretic features based on the basic reconstruction and regeneration properties 
of general MSCR and stable MSCR codes. 

4.1 Expression of Secrecy Capacity 

As assumed in the eavesdropping model, the $\{l_1, l_2\}$-eavesdropper has access to the following information

$$\{W_E, S^F\} = \{ W_E,\ \{ S_D^i, S^i_{C \setminus \{i\}} \mid i \in C \cap F,\ C \subset [1,n],\ D \subset [1,n] \setminus C,\ |C| = t,\ |D| = d \} \}. \qquad (25)$$

Similar to the definition of secrecy capacity of MSR codes m, we have the following result. 

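Lemma 1. For any MSCR code with parameter set $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$, we have

$$\begin{aligned} B^{(s)} &= H(W_E, W_F, W_G \mid W_E, S^F) \\ &= H(W_G \mid W_E, W_F) - H(S^F \mid W_E, W_F) \\ &= (k - l_1 - l_2)\alpha - H(S^F \mid W_E, W_F). \end{aligned} \qquad (26)$$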

Proof. First, we can use the MRD codes [35] (e.g. Gabidulin code m) to pre-code the original data file 
of size $B = k\alpha$, which is required to consist of a secure data file of size $B - H(W_E, S^F)$ and a random data file of size $H(W_E, S^F)$. As shown in [27, 28, 43], this kind of construction of secure codes can always meet the conditions of secrecy (stated in the footnote below), which means precisely that the maximal file size that can be securely stored is

$$B^{(s)} = B - H(W_E, S^F) = H(W_E, W_F, W_G \mid W_E, S^F). \qquad (27)$$

Second, we can deduce

$$\begin{aligned} & H(W_G \mid W_E, W_F) - H(W_E, W_F, W_G \mid W_E, S^F) \\ &= H(W_G \mid W_E, W_F) - H(W_E, W_F, W_G \mid W_E, W_F, S^F) \\ &= H(W_G \mid W_E, W_F) - H(W_G \mid W_E, W_F, S^F) \\ &= I(W_G; S^F \mid W_E, W_F) \\ &= H(S^F \mid W_E, W_F) - H(S^F \mid W_E, W_F, W_G) \\ &= H(S^F \mid W_E, W_F). \end{aligned} \qquad (28)$$

Footnote (conditions of secrecy): Consider a DSS with data file $f^s$, random data file $r$ (independent of $f^s$), and an eavesdropper with observations given by $e$. If $H(e) \le H(r)$ and $H(r \mid f^s, e) = 0$, then the mutual information leakage to the eavesdropper is zero, i.e., $I(f^s; e) = 0$.
Then, for MSCR codes, we further have $H(W_G \mid W_E, W_F) = (k - l_1 - l_2)\alpha$, where $\alpha = (d-k+t)\beta$. Combining these equations completes the proof.

Remark 4 Based on this definition of secrecy capacity, we only need to calculate or estimate the value of $H(S^F \mid W_E, W_F)$.


4.2 Properties of General MSCR Codes 

We present some properties of MSCR codes as below. 


Lemma 2. For any MSCR code with parameter set $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$ where $t < k$, consider any three pairwise disjoint subsets $A$, $B$ and $C$ with $\{|C| = t, |A| = k-t, |B| = d-k+t\}$. It must be that

$$\begin{cases} H(S^C_{A \cup B}) = dt\beta \\ H(S^C_B \mid W_C, S^C_A) = 0. \end{cases} \qquad (29)$$


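Proof. We present them as follows.

1. Because MSCR codes are storage-efficient codes with the MDS property, it is trivial that $H(W_C \mid S^C_A) = H(W_C) = t\alpha$, since $|A| + |C| = k$ and $A \cap C = \emptyset$.

2. Set $B = \{b_1, b_2, \cdots, b_{d-k+t}\}$. From the regeneration property of MSCR codes, we know $H(W_C \mid S^C_{A \cup B}) = 0$. Now, we have

$$\begin{cases} H(W_C \mid S^C_A) - H(W_C \mid S^C_A, S^C_{b_1}) = I(W_C; S^C_{b_1} \mid S^C_A) \le t\beta; \\ \qquad \vdots \\ H(W_C \mid S^C_A, S^C_{b_1}, \cdots, S^C_{b_{d-k+t-1}}) - H(W_C \mid S^C_A, S^C_B) \le t\beta. \end{cases} \qquad (30)$$

By summing up the inequalities, we derive

$$t\alpha = H(W_C \mid S^C_A) - H(W_C \mid S^C_A, S^C_B) \le (d-k+t)t\beta. \qquad (31)$$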


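Because $\alpha = (d-k+t)\beta$, all the inequalities in (30) must in fact hold with equality. Thus, for every $1 \le i \le d-k+t$, we have

$$\begin{cases} H(S^C_{b_i} \mid S^C_A, S^C_{b_1}, \cdots, S^C_{b_{i-1}}) = t\beta \\ H(S^C_{b_i} \mid W_C, S^C_A, S^C_{b_1}, \cdots, S^C_{b_{i-1}}) = 0, \end{cases} \qquad (32)$$

from which we further obtain

$$H(S^C_B \mid S^C_A) = \sum_{i=1}^{d-k+t} H(S^C_{b_i} \mid S^C_A, S^C_{b_1}, \cdots, S^C_{b_{i-1}}) = (d-k+t)t\beta \qquad (33)$$

and

$$H(S^C_B \mid W_C, S^C_A) = \sum_{i=1}^{d-k+t} H(S^C_{b_i} \mid W_C, S^C_A, S^C_{b_1}, \cdots, S^C_{b_{i-1}}) = 0. \qquad (34)$$

According to equation (33), we further know $H(S^C_B) = (d-k+t)t\beta$, with which we obtain $H(S^C_b) = t\beta$ for any $b \in B$. Due to the arbitrariness of the choice of the two sets $A$ and $B$, we can also deduce $H(S^C_A) = (k-t)t\beta$ for $|A| = k-t < k$. Thus, combining equation (33), we get

$$\begin{aligned} H(S^C_{A \cup B}) &= H(S^C_A) + H(S^C_B \mid S^C_A) \\ &= (k-t)t\beta + (d-k+t)t\beta \\ &= dt\beta. \end{aligned} \qquad (35)$$

Based on the above proof, it is obvious that equation (29) still holds when $t = k$ and $A = \emptyset$.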


Remark 5 Since it is trivial that $H(S^C_{A \cup B}) \le dt\beta$, equation (35) means precisely that there is no intersection pattern within the repair data $S^C_{A \cup B}$, i.e., all the contents of the repair data $S^C_{A \cup B}$ are mutually independent when $t \le k$. In addition, we have the following observations:


1. When t < k, equation \35^ further implies that dtp < ka as the total information entropy of data 
storage is ka, which leads to {d — k){k — t)P > 0. When k > t, it must be that d > k. When t = k, if 
d < k, the two terms of equation 0 will be contradictory. Thus, it must be that d> k when t < k. 

2. When t > k, the second term of equation §1 H{Wc\S%) = 0 means that ka < dtp, which is 
equivalent to {d — k)(t — k)P > 0. Hence, it also can be derived that d> k in this case. 


3. Both cases show that there do not exist MSCR codes with d < k. 
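The algebra behind observations 1 and 2 can be written out explicitly. The short derivation below only substitutes $\alpha = (d-k+t)\beta$ into the two inequalities $dt\beta \le k\alpha$ (for $t \le k$) and $k\alpha \le dt\beta$ (for $t > k$) quoted above; no additional assumption is used.

```latex
\begin{align*}
k\alpha - dt\beta &= k(d-k+t)\beta - dt\beta
                   = (kd - k^2 + kt - dt)\beta
                   = (d-k)(k-t)\beta \;\ge\; 0 && (t \le k),\\
dt\beta - k\alpha &= (dt - kd + k^2 - kt)\beta
                   = (d-k)(t-k)\beta \;\ge\; 0 && (t > k).
\end{align*}
```

In the first line the factor $(k-t)$ is positive when $t < k$, and in the second line $(t-k)$ is positive, so in both cases the inequality forces $d \ge k$ (with $\beta > 0$).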


Furthermore, it is interesting to find that when $t > k$ and $d = k$, it must be that $H(S^C_D) = dt\beta$, because $\alpha = (d-k+t)\beta = t\beta$, which leads to $k\alpha = H(W_C) \le H(S^C_D) \le dt\beta = k\alpha$. In other words, there is also no intersection pattern within the repair data when $t > k$ and $d = k$.


Lemma 3. For any MSCR code with parameter set $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$, consider any single repair of node $i$ in a repair group $\{i, C'\}$ and two other disjoint subsets $A'$ and $B'$ such that $\{|C'| = t-1,\ |A'| = k-1,\ |B'| = d-k+1,\ (A' \cup B') \cap C' = \emptyset,\ i \notin A' \cup B' \cup C'\}$. It must be that

( = {d + t — 1)P , , 

\H{Sh„Sfc,\W„Sl,,)=0. 


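Proof. We let $B' = \{b'_1, \cdots, b'_{d-k+1}\}$ and $C' = \{c'_1, \cdots, c'_{t-1}\}$. Then, we have

$$\begin{cases} H(W_i \mid S^i_{A'}) - H(W_i \mid S^i_{A'}, S^i_{b'_1}) \le H(S^i_{b'_1}) \le \beta; \\ \qquad \vdots \\ H(W_i \mid S^i_{A'}, S^i_{B' \setminus \{b'_{d-k+1}\}}) - H(W_i \mid S^i_{A'}, S^i_{B'}) \le \beta \end{cases} \qquad (37)$$

and

$$\begin{cases} H(W_i \mid S^i_{A'}, S^i_{B'}) - H(W_i \mid S^i_{A'}, S^i_{B'}, S^i_{c'_1}) \le H(S^i_{c'_1}) \le \beta'; \\ \qquad \vdots \\ H(W_i \mid S^i_{A'}, S^i_{B'}, S^i_{C' \setminus \{c'_{t-1}\}}) - H(W_i \mid S^i_{A'}, S^i_{B'}, S^i_{C'}) \\ \quad = H(S^i_{c'_{t-1}} \mid S^i_{A'}, S^i_{B'}, S^i_{C' \setminus \{c'_{t-1}\}}) - H(S^i_{c'_{t-1}} \mid W_i, S^i_{A'}, S^i_{B'}, S^i_{C' \setminus \{c'_{t-1}\}}) \le \beta'. \end{cases} \qquad (38)$$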


By summing up all the inequalities (37) and (38), along with the fact that $\beta = \beta'$ in the MSCR scenario, we derive

$$\alpha = H(W_i \mid S^i_{A'}) - H(W_i \mid S^i_{A'}, S^i_{B'}, S^i_{C'}) \le (d+t-k)\beta, \qquad (39)$$

from which all the inequalities (37) and (38) must become equalities, similarly to Lemma 2. Thus, we get the proof.


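Remark 6 According to the second term of equation (36), we naturally derive

$$\begin{cases} H(S^i_{B'} \mid W_i, S^i_{A'}) = 0 \\ H(S^i_{C'} \mid W_i, S^i_{A'}) = 0, \end{cases} \qquad (40)$$

using which we can further simplify $H(S^F \mid W_E, W_F)$.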


4.3 Properties of Stable MSCR Codes 

Some properties of stable MSCR codes are presented as follows. Note that stable MSCR codes also have the above properties of general MSCR codes in Lemmas 2 and 3, since stable MSCR codes are still MSCR codes.









(41) 


Lemma 4. For any stable MSCR code with parameter set {n > d + t,k, d, t, a, /3, /3'}, we 


S^ = {Wf.S^}, 


from which we further obtain 

H{S^\We. Wf) = H[S^\We. Wf) = HiS^lWE, Wf) = H{S^), (42) 

where G is a set of size (k — li — 12 ) and is disjoint with E and F as defined in the eavesdropping model. 


Proof. The proof is separated into two parts as below. 

1. First, we know for any i G F, = {S'!,, |j G C, Cc[ 1, n], Z1 c[1, n] \ C, \C\ = t, \D\ = d}. The 
“stable” property of MSCR codes will lead to that 

e C,cc[l,n], |C| = t}, (43) 


where we claim again that exchanging data {Sfj\i,j G C} does not have the “stable” constraints and may 
vary depending on different repair groups C. In addition, it must be that i?(Wi, S’*!!?®) = 0 from equation 
1^. The following shows that the exchanging data {5^\^{j}|C'c[l,n], \C\ = <} is a function of the content 
of where can be replaced by S\ 

For any repair group $C$ including $i$, there always exists some set $A''$ such that $A'' \cap C = \emptyset$ and $|A''| = k-1$, because $d \ge k$. Then, according to the second term of equation (40) following Lemma 3, we have

$$H(S^i_{C \setminus \{i\}} \mid W_i, S^i_{A''}) = 0. \qquad (44)$$


Thereby, we derive 

' H{S^\W,,S^) 

. = 0 . 


(45) 


Therefore, from H{W^,S^\S^) = H{S^Wi,S^) = 0, we naturally have {5®} = {Wi,5'®} and further 
get = 

2. Assume all the n nodes are comprised of E, F, G, T, where jif U F U G| = k and |T| = n — k. So, 
we have 

' H{S^\W[e,f}) 

= HiWF,S^\W^E,F}) 

— H{S^pQrp\W[E,F}) 

= H{Sqf\W[e,f}) 

^ = HiS^\W[E.F}) + HiS^\W[E,F},S^). 


■* Lemma shows that it does not matter if the exchanging data in the stable MSCR scenario is restricted 
to be fixed or not, because the exchanging data G C, Cc[l,n]} is only a function of the content 

of {IFi, S'j]^ In other words, the total information of the exchanging data G C, Cc[l,n]} is 

included in {Wi,Sf So, when calculating the amount of the eavesdropped data, we do not need to 

consider the exchanging data |* G C, C'c[l, n]}, while we only need to focus on the combination of some 

node’s storage and its repair data {Wi, S'®}. 
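Then, for any $i \in F$ and any subset $T' \subseteq T$ of size $d-k+1$,

$$\begin{aligned} H(S^i_{T'} \mid W_{\{E,F\}}, S^F_G) &\le H(S^i_{T'} \mid W_{\{E,F\}}, S^i_G) \\ &= H(S^i_{T'} \mid W_i, W_{\{E,F\} \setminus \{i\}}, S^i_G) \\ &\le H(S^i_{T'} \mid W_i, S^i_{\{E,F\} \setminus \{i\}}, S^i_G) \\ &= H(S^i_{T'} \mid W_i, S^i_{\{E,F,G\} \setminus \{i\}}). \end{aligned} \qquad (47)$$

Based on the first term of equation (40) and the fact that $|\{E,F,G\} \setminus \{i\}| = k-1$, we obtain

$$H(S^i_{T'} \mid W_{\{E,F\}}, S^F_G) = 0, \qquad (48)$$

where $T'$ can be any subset of $T$ of size $d-k+1$. Owing to the arbitrariness of $T'$, we can deduce that $H(S^i_T \mid W_{\{E,F\}}, S^F_G) = 0$, which further leads to $H(S^F_T \mid W_{\{E,F\}}, S^F_G) = 0$. Furthermore, it is trivial that $H(S^F_G \mid W_E, W_F) = H(S^F_G)$.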

Remark 7 From the above proof, we can easily find that the formulation F[{S^\Wf) = H{Sq) still 
holds, when E = 0 and |F" U G| = k. However, it should be noted that, unlike MSR codes, MSCR codes 
do not necessarily have the property that H{Wi\S'^) = 0. MSCR codes only have a similar format that 
H{Wi\H) = 0 instead. 


Lemma 5. In the stable MSCR scenario, for any subset $F$ such that $|F| \le k-1$, and arbitrary different $i_1, i_2$ where $i_1, i_2 \notin F$, we have $H(S^F_{i_1}) = H(S^F_{i_2})$. Furthermore, we have

• When $t \le k$, for any $|F| \le t$, we always have $H(S^F_i) = |F|\beta$, where $i \notin F$.

• When $t > k$ and $d = k$, for any $|F| \le t$, we still have $H(S^F_i) = |F|\beta$, where $i \notin F$.

Proof. We present them as the following two parts. 


1. From Lemma 1^ and Remark]^ we have 

H{S^) 

= H{S^,Wf) 

< = H{Wf)+H{S^\Wf) (49) 

= H{Wf) + H{S^,\Wf) 

^ =H{WF)+HiS^,), 


where G' is a random subset of [l,n] such that \G' U F\ = k and G' Cl F = 0. Since \F\ < k — 1, then 
\G'\ > 1. 

When \G'\ = 1, for any two different gi and g 2 where 51,32 G {[l,n] \ F}, 

H{S^) = H{Wf) + = H^Wf) + H{S^J, (50) 

which indicates H{Sg^) = H{Sgff. 

When \G'\ > 2, we set G' = {g', Gi} and G" = {g", Gi} such that {g' g", |G'| = |G"| = fc- |F|, G'n 
F" = G" n F = 0}. Similarly, we obtain 


' H{S^) 

= H{Wf) + H{S^,) 

= H{Wf)+H{S^,) + H{S^^)- 
H{S^) 

= HiWF) + H{S^„) 

^ =H{Wf)+H{S^„)+H{S^J, 


(51) 


Footnote to Lemma 5: In the situation when $t > k$ and $d = k$, note that if $k \le |F| \le t$, the formula $H(S^F_i) = |F|\beta$ still holds.
which implies $H(S^F_{g'}) = H(S^F_{g''})$.

Because of the arbitrariness of the choices of $(g_1, g_2)$ and $(g', g'')$, we have $H(S^F_{i_1}) = H(S^F_{i_2})$ for arbitrary different $i_1, i_2$ where $i_1, i_2 \notin F$.

2. Remark 5 following Lemma 2 shows that, in the situations when $t \le k$ or when $t > k$ and $d = k$, the contents of any repair data (from any helper node set $D$ to any repair group $C$) are mutually independent. Due to the arbitrary choices of $C$ and $D$ and the stable repair property, we obtain, for any $|F| \le t$, $H(S^F_i) = |F|\beta$, where $i \notin F$.


5 MAIN RESULTS ON SECRECY CAPACITY 

In this section, we will use a simple formulation to present a generally applicable expression of secrecy 
capacity for stable MSCR codes. Then, we give some specific results on the secrecy capacity of stable 
MSCR codes. At last, we take the stable MSCR code as an example to verify the secrecy capacity obtained 
from information theory. 


5.1 Simple Expression of Secrecy Capacity 

Leveraging the lemmas we obtain before, we have the following theorem. 

Theorem 1. For any stable MSCR code with parameter set $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$,

$$B^{(s)} = (k - l_1 - l_2)(\alpha - H(S^F_g)), \qquad (52)$$

where $g \in G$, $|G| = k - l_1 - l_2$ and $|F| = l_2 \le l_1 + l_2 \le k-1$.

Proof. Lemma 1 and Lemma 4 mean that, for stable MSCR codes, we have the following expression of secrecy capacity

$$B^{(s)} = (k - l_1 - l_2)\alpha - H(S^F_G), \qquad (53)$$

where $|G| = k - l_1 - l_2$ and $l_1 + l_2 \le k-1$.

Lemma 5 indicates that, in the stable MSCR scenario, for any subset $F$ such that $|F| \le k-1$ and for arbitrary $g_1, g_2 \in G$, we have




(54) 


From equations (53) and (54), we naturally obtain the expression (52).


Remark 8 The formulation (52) can be regarded as the simplest way to express the secrecy capacity of stable MSCR codes, since we only need to concentrate on $S^F_g$, the repair data sent from a single node $g$, where $g \in G$.


5.2 Some Results on Secrecy Capacity 

Putting all together, we give the following result. 

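Theorem 2. Given a stable MSCR code with $\{n \ge d+t, k, d, t, \alpha, \beta, \beta'\}$, for $l_1 + l_2 \le k-1$, we have

$$B^{(s)} = (k - l_1 - l_2)(\alpha - \pi(\beta, l_2)), \qquad (55)$$

where

$$\pi(\beta, l_2) = l_2\beta, \quad \text{for } \begin{cases} l_2 \le t \le k; \\ \text{or } t > k \text{ and } d = k. \end{cases} \qquad (56)$$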


Proof. Lemma 5 and Theorem 1 directly lead to

$$B^{(s)} = (k - l_1 - l_2)(\alpha - l_2\beta) = (k - l_1 - l_2)(d - k + t - l_2)\beta, \qquad (57)$$

when $l_2 \le t \le k$ or when $t > k$ and $d = k$.
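Equation (57) is easy to evaluate numerically. The helper below is a minimal sketch that encodes only the two parameter regimes covered by Theorem 2 and refuses anything else; the function name and the argument order are hypothetical.

```python
def stable_mscr_secrecy_capacity(k, d, t, beta, l1, l2):
    """Evaluate (k - l1 - l2)(alpha - l2*beta) with alpha = (d - k + t)*beta, as in (57)."""
    if d < k or min(l1, l2) < 0 or l1 + l2 > k - 1:
        raise ValueError("need d >= k, l1, l2 >= 0 and l1 + l2 <= k - 1")
    # Theorem 2 covers only: l2 <= t <= k, or t > k with d = k.
    if not ((l2 <= t <= k) or (t > k and d == k)):
        raise ValueError("parameters outside the cases covered by Theorem 2")
    alpha = (d - k + t) * beta
    return (k - l1 - l2) * (alpha - l2 * beta)
```

For instance, with $k = 4$, $d = 5$, $t = 2$, $\beta = 1$ and $l_1 = l_2 = 1$, we have $\alpha = 3$ and the function returns $2 \cdot (3 - 1) = 4$.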

Remark 9 The above theorem is only applicable to stable MSCR codes. The authors in [43] give a similar result in the situation when $d = k$ and $l_2 \le t$, but they consider only a single repair group.
5.3 Specific Calculation of Secrecy Capacity


Here, we are to analyze the specific secrecy capacity of the stable MSCR code obtained in Section 3.2. 

Without loss of generality, we assume the eavesdropper can observe the content of the node set $E = [1, l_1]$ and the repair downloads of the node set $F = [l_1+1, l_1+l_2]$, where $l_1 + l_2 \le k-1$. Thus, the eavesdropper has the knowledge of

$$\{ W_{[1,l_1]},\ S^{[l_1+1,l_1+l_2]} = \{ S^i_D, S^i_{C \setminus \{i\}} \mid i \in C \cap [l_1+1, l_1+l_2],\ C \subset [1,n],\ D \subset ([1,n] \setminus C) \} \}, \qquad (58)$$

where $C$ denotes the repair group, $D$ is the set of helper nodes and $\subset$ means traversing all such subsets. Interestingly, we find that the exchanging data $S^i_{C \setminus \{i\}}$ is also invariant in this stable MSCR code, although in general we allow it to vary with different repair groups. We make the calculation in detail as follows.

First, we have $W_{[1,l_1]} = \{ m_i^T g_j \mid i = 1, 2, \cdots, t;\ j = 1, \cdots, l_1 \}$, where $(m_1, m_2, \cdots, m_t)$ are the original data packets.

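Second, we have

$$\begin{aligned} S^{[l_1+1,l_1+l_2]} &= \{ (m'_i)^T \cdot [g_1, \cdots, g_{i-1}, g_{i+1}, \cdots, g_n],\ [m'_1, \cdots, m'_{i-1}, m'_{i+1}, \cdots, m'_n]^T \cdot g_i \mid i \in [l_1+1, l_1+l_2] \} \\ &= \{ (g'_i)^T \cdot [m_1, m_2, \cdots, m_t]^T \cdot [g_1, \cdots, g_{i-1}, g_{i+1}, \cdots, g_n] \mid i \in [l_1+1, l_1+l_2] \} \\ &\quad \cup \{ [g'_1, \cdots, g'_{i-1}, g'_{i+1}, \cdots, g'_n]^T \cdot [m_1, m_2, \cdots, m_t]^T \cdot g_i \mid i \in [l_1+1, l_1+l_2] \}, \end{aligned} \qquad (59)$$

where the first set in the union is the repair data sent to the nodes in $[l_1+1, l_1+l_2]$ and the second set is the exchanging data they receive. (60)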


Now, we are to verify some properties of stable MSCR codes. 
Verification 1. According to the first part of Lemma we should have 

5[G+i./i+/2] = 


(61) 


(62) 


where = [mi, m 2 , • • • , m.tY'' [gZi+i, ’ ’ ’ , Sh+h]- Since any t x t submatrix of G' is invertible 

we can directly deduce 

that naturally leads to and further verifies the first part of Lemma 

5[G+i./i+G] ^ ^[/i+i.G+G] u 5 ['i+i.G+/2] ^ (63) 


Here, it should be noted that the property _ W[i^+iu+i^] is not applicable to any stable 

MSCR codes and is only feasible in this special stable MSCR cod^ 

® Althou g h Lemma [pleads to ^Ih+i.ii+G] = glG+hii+G] u^l'i+hh+G] = } that cor¬ 
responds to equation (|63|, we cannot derive that 5[G+i3i-l-i2] _ for any stable MSCR codes. 

The reason is that ^IGTTG+G] are not independent with 5 'A+i.G+G]^ j g ^ there exists the 

intersection pattern between glb+i^i+G] ^[ii+i.G+G] gg .^^gp gg between IF[;j+i,;j+; 2 ] and 
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Verification 2. Then, we have which is the infor¬ 

mation leakage obtained by the eavesdropper. From the second part of Lemmait should be that 

= i?(VF[i./,+i2]) + |t^[i,/,+;.]) (64) 

= 7L(tF[i,,+,2])+^(4™S])- 


As we know, 

[ VF[i,/,+i 2 ] = [mi, m 2 , • • • , mt]^ • [gi. 


5 §^ 1+^2 


1 -S'['i+i’'i+'"l = {gf ■ [mi,m2 ,--- ,mt]^- [gi,--- , gi_i, g^+i, • • • ,g„]|iG [^i + 1, + ^ 2 ]} 

from which we have 


I = |W^[1A+G]) + |W^[1A+Gb 


l +^2 + l?^] 


)• 


(65) 


( 66 ) 


Because any fcx/c submatrix of G is invertible, we know that [gi, • • • , gi^+ij] and [gij+i 2 -i-i, ■ • ■ , gfc] are mu¬ 
tually independent. Based on this observation, we can derive ^^(S'lii+il'+i'fcf l^[i 4 i+* 2 ]) = -^^('S'lzi+il'+i'fef 
In addition, given the following formulations 


^[i,ii+i2\ = [mi, m 2 , • • • , mtf • [gi, • • • , Sh+h] 

4u+i2+Sf = {feUiG-- ,Si.,+i,f ,mtf -[gi.+i.+i,--- ,gfc]}. 


(67) 


we can obtain [g[^_|_i,--- ,g(^_|_/ 2 ]^ • [mi,m 2 ,--- ,mt]^ for the invertiblity of [gi,-- - ,gfc], with which 
we further derive {414144*"' = feii-Hi’ ’ ’ ’ ’ si+h]'^ ' [mi, m 2 , • • • , m*]^ • [g^+i, • • • , g„]|. That exactly 
means (4fc+'44*"'4G+/2+i'fe{) “ Thus, the second part of Lemmaj^is also verified. 

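Verification 3. Finally, we can deduce that the size of the information leakage obtained by the eavesdropper is precisely equal to

$$\begin{aligned} H(W_{[1,l_1]}, S^{[l_1+1,l_1+l_2]}) &= H(W_{[1,l_1+l_2]}) + H(S^{[l_1+1,l_1+l_2]}_{[l_1+l_2+1,k]}) \\ &= (l_1 + l_2)\alpha + \sum_{g = l_1+l_2+1}^{k} H(S^{[l_1+1,l_1+l_2]}_g), \end{aligned} \qquad (68)$$

where $S^{[l_1+1,l_1+l_2]}_g = [g'_{l_1+1}, \cdots, g'_{l_1+l_2}]^T \cdot [m_1, m_2, \cdots, m_t]^T \cdot g_g$. Because any $t \times t$ submatrix of $G'$ is invertible, we have

$$H(S^{[l_1+1,l_1+l_2]}_g) = \begin{cases} l_2\beta & \text{if } l_2 \le t; \\ t\beta & \text{if } l_2 > t. \end{cases} \qquad (69)$$

Combining equations (68) and (69), we obtain, for $l_1 + l_2 \le k-1$,

$$B^{(s)} = \begin{cases} (k - l_1 - l_2)(\alpha - l_2\beta) & \text{if } l_2 \le t; \\ 0 & \text{if } l_2 > t, \end{cases} \qquad (70)$$

where $\alpha = (d - k + t)\beta = t\beta$.

As we can see, this result is exactly a special case of Theorem 2 when $d = k$.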
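The leakage count in (68) and (69) can be cross-checked numerically on small instances. Since every observed symbol is a linear form in the $kt$ message symbols, the number of independent leaked symbols (with $\beta = 1$) equals the rank of the stacked coefficient vectors. The sketch below computes that rank under assumptions made here for illustration: a prime field of size `P`, $\theta_j = j$ for $G$, and a Cauchy matrix for the non-systematic part of $G'$; all names are hypothetical.

```python
P = 257  # prime field size (assumption)

def rank_mod(rows):
    """Rank over GF(P) of a list of equal-length integer rows."""
    rows = [[v % P for v in r] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], -1, P)
        rows[rank] = [v * inv % P for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(v - f * w) % P for v, w in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def form(u, v):
    """Coefficient vector of the scalar u^T M v, viewed as a linear form in the t*k entries of M."""
    return [a * b % P for a in u for b in v]

def leaked_symbols(k, t, n, l1, l2):
    g = {j: [pow(j, e, P) for e in range(k)] for j in range(1, n + 1)}  # columns of G
    def gp(j):  # columns of G' = [I_t | Cauchy], with x_i = i and y_j = 100 + j
        if j <= t:
            return [int(i == j) for i in range(1, t + 1)]
        return [pow((i - 100 - j) % P, -1, P) for i in range(1, t + 1)]
    unit = lambda i: [int(r == i) for r in range(t)]
    seen = []
    for j in range(1, l1 + 1):                  # stored content W_E, E = [1, l1]
        seen += [form(unit(i), g[j]) for i in range(t)]
    for i in range(l1 + 1, l1 + l2 + 1):        # repair downloads of F = [l1+1, l1+l2]
        for j in range(1, n + 1):
            if j != i:
                seen.append(form(gp(i), g[j]))  # repair data from node j to node i
                seen.append(form(gp(j), g[i]))  # exchanging data from node j to node i
    return rank_mod(seen)
```

For $k = 3$, $t = 2$, $n = 7$, $l_1 = l_2 = 1$, the rank is $5 = (l_1 + l_2)t + (k - l_1 - l_2)l_2$, leaving $kt - 5 = 1$ secure symbol, which agrees with (70).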

Remark 10 As shown in Section 3.1.2, the original MSCR code given in [41] has poor secrecy capacity and may lose all data secrecy in some cases even when $l_2 = 1$. In contrast, the stable MSCR code obtained by conversion offers better secrecy capacity and always provides a positive secrecy capacity whenever $l_2 < t$ and $l_1 + l_2 \le k-1$; see equation (70).
6 CONCLUSION


In this work, we study the secrecy capacity of minimum storage cooperative regenerating codes. We identify a critical detail of the repair strategy: the content of the repair data may vary depending on the choice of the repair group or the set of helper nodes, which was neglected by the previous studies [15]. We therefore introduce a new type of codes called “stable” MSCR codes, where the repair data is independent of the repair groups and the sets of helper nodes. In this direction, we find that the two MSCR codes proposed in [40, 41] are not stable, and we convert the MSCR code given in [41] to a stable one, which has better secrecy capacity than the original. In addition, we use information theory to give some specific results on secrecy capacity.

Although we present some results on the data secrecy of MSCR codes, many related research questions remain open. First, more examples of MSCR codes and stable MSCR codes need to be explored. Second, the secrecy capacity needs to be characterized in more diverse situations than those considered in this paper.